Combining the Sparsity and Unambiguity Biases for Grammar Induction
Author
Abstract
In this paper we describe our participating system for the dependency induction track of the PASCAL Challenge on Grammar Induction. Our system incorporates two types of inductive bias: a sparsity bias and an unambiguity bias. The sparsity bias favors a grammar with fewer grammar rules. The unambiguity bias favors a grammar that leads to unambiguous parses, motivated by the observation that natural language is remarkably unambiguous: the number of plausible parses of a sentence is very small. We introduce our approach to combining these two types of biases and discuss the system implementation. Our experiments show that both types of inductive bias are beneficial to grammar induction.
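As a rough illustration of how the two biases could enter a single training objective, the sketch below combines a soft count of active grammar rules (the sparsity term) with the entropy of each sentence's posterior over parses (the unambiguity term). The function name `regularized_objective`, the data layout, and both penalty forms are assumptions made for this sketch, not the system's actual formulation.

```python
import numpy as np

def regularized_objective(log_likelihood, rule_expected_counts, parse_posteriors,
                          lam_sparsity=1.0, lam_unambiguity=1.0, tau=1e-2):
    """Toy objective: data log-likelihood minus a sparsity penalty and an
    unambiguity penalty (all names and penalty forms are illustrative)."""
    # Sparsity bias: a soft count of the rules that are actually used.
    # c / (c + tau) is near 1 for well-used rules and near 0 for unused ones,
    # so a grammar with fewer active rules incurs a smaller penalty.
    sparsity_penalty = sum(c / (c + tau) for c in rule_expected_counts.values())

    # Unambiguity bias: the entropy of each sentence's posterior over parses.
    # A posterior concentrated on one or two parses has low entropy, so a
    # grammar that parses sentences unambiguously incurs a smaller penalty.
    unambiguity_penalty = sum(-np.sum(q * np.log(q + 1e-12))
                              for q in parse_posteriors)

    return (log_likelihood
            - lam_sparsity * sparsity_penalty
            - lam_unambiguity * unambiguity_penalty)

# Example: two rules, two sentences with (already normalized) parse posteriors.
counts = {("ROOT", "VERB"): 3.2, ("VERB", "NOUN"): 0.001}
posteriors = [np.array([0.9, 0.1]), np.array([0.5, 0.3, 0.2])]
print(regularized_objective(-42.0, counts, posteriors))
```

In a full system, an objective of this shape would be maximized with an EM-style procedure that re-estimates rule probabilities in the M-step; the sketch only shows where the two biases enter.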
Similar Papers
Unambiguity Regularization for Unsupervised Learning of Probabilistic Grammars
We introduce a novel approach named unambiguity regularization for unsupervised learning of probabilistic natural language grammars. The approach is based on the observation that natural language is remarkably unambiguous in the sense that only a tiny portion of the large number of possible parses of a natural language sentence are syntactically valid. We incorporate an inductive bias into gram...
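One way such a bias is often realized inside EM, shown here purely as an illustrative sketch (the abstract above is truncated and does not spell out the update; `sharpen_posterior` and `sigma` are names introduced for illustration), is to sharpen each sentence's posterior over candidate parses in the E-step so that probability mass concentrates on a few parses.

```python
import numpy as np

def sharpen_posterior(parse_probs, sigma=0.5):
    """Illustrative annealed E-step: raise the posterior over a sentence's
    candidate parses to the power 1 / (1 - sigma) and renormalize.

    sigma = 0 leaves the posterior unchanged; as sigma approaches 1 the
    update approaches a hard, Viterbi-style choice of the single best parse.
    """
    p = np.asarray(parse_probs, dtype=float)
    q = p ** (1.0 / (1.0 - sigma))
    return q / q.sum()

# Example: a spread-out posterior over four candidate parses becomes
# noticeably more peaked after sharpening.
print(sharpen_posterior([0.4, 0.3, 0.2, 0.1], sigma=0.5))
```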
Sparsity in Dependency Grammar Induction
A strong inductive bias is essential in unsupervised grammar induction. We explore a particular sparsity bias in dependency grammars that encourages a small number of unique dependency types. Specifically, we investigate sparsity-inducing penalties on the posterior distributions of parent-child POS tag pairs in the posterior regularization (PR) framework of Graça et al. (2007). In experiments w...
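The sketch below gives a rough Python illustration of a sparsity-inducing penalty of this general flavor: posterior edge probabilities are grouped by parent-child POS tag pair, the maximum within each group is taken, and the maxima are summed, an L1/L-infinity-style penalty. The data layout and the name `l1_linf_penalty` are assumptions for illustration rather than the exact penalty used in the cited work.

```python
from collections import defaultdict

def l1_linf_penalty(edge_posteriors):
    """Sum, over parent-child POS tag pairs, of the maximum posterior
    probability assigned to any dependency edge of that type.

    edge_posteriors : iterable of ((parent_tag, child_tag), prob) pairs,
        e.g. gathered from per-sentence edge expectations over a corpus.
    """
    group_max = defaultdict(float)
    for tag_pair, prob in edge_posteriors:
        group_max[tag_pair] = max(group_max[tag_pair], prob)
    return sum(group_max.values())

# Example: two distinct dependency types, three observed edges.
edges = [(("VERB", "NOUN"), 0.9), (("VERB", "NOUN"), 0.7), (("NOUN", "ADJ"), 0.2)]
print(l1_linf_penalty(edges))  # 0.9 + 0.2 = 1.1
```

The penalty grows with the number of distinct dependency types that receive strong posterior support, so minimizing it alongside the likelihood encourages a small set of unique parent-child tag pairs.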
متن کاملیک مدل بیزی برای استخراج باناظر گرامر زبان طبیعی
In this paper, we show that the problem of grammar induction can be modeled as a combination of several model selection problems. We use the infinite generalization of a Bayesian model of cognition to solve each model selection problem in our grammar induction model. This Bayesian model is capable of solving model selection problems in a manner consistent with human cognition. We also show that using th...
Syntactic islands and learning biases: Combining experimental syntax and computational modeling to investigate the language acquisition problem
The induction problems facing language learners have played a central role in debates about the types of learning biases that exist in the human brain. Many linguists have argued that some of the learning biases necessary to solve these language induction problems must be both innate and language-specific (i.e., the Universal Grammar (UG) hypothesis). Though there have been several r...
Deterministic Cooperating Distributed Grammar Systems
Subclasses of grammar systems that can facilitate parser construction appear to be of interest. In this paper, some syntactical conditions considered for strict deterministic grammars are extended to cooperating distributed grammar systems, restricted to the terminal derivation mode. Two variants are considered, according to the level at which the conditions apply. The local variant, which int...